Smart Paper Notes

From Judgement's Premises Towards Key Points

Oren Sultan , Rayen Dhahri , Yauheni Mardan , Tobias Eder , Georg Groh

Categories: Natural Language Processing | Artificial Intelligence

2022-12-23

Key Point Analysis (KPA) is a relatively new task in NLP that combines summarization and classification by extracting argumentative key points (KPs) for a topic from a collection of texts and categorizing their closeness to the different arguments. In our work, we focus on the legal domain and develop methods that identify and extract KPs from premises derived from texts of judgments. The first method is an adaptation of an existing state-of-the-art method, and the two others are new methods that we developed from scratch. We present our methods and examples of their outputs, as well as a comparison between them. The full evaluation of our results is done on the matching task: matching the generated KPs to arguments (premises).

Translated by Google Translate

Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking

Keshav Santhanam , Jon Saad-Falcon , Martin Franz , Omar Khattab , Avirup Sil , Radu Florian , Md Arafat Sultan , Salim Roukos , Matei Zaharia , Christopher Potts

Categories: Natural Language Processing

2022-12-02

Neural information retrieval (IR) systems have progressed rapidly in recent years, in large part due to the release of publicly available benchmarking tasks. Unfortunately, some dimensions of this progress are illusory: the majority of the popular IR benchmarks today focus exclusively on downstream task accuracy and thus conceal the costs incurred by systems that trade away efficiency for quality. Latency, hardware cost, and other efficiency considerations are paramount to the deployment of IR systems in user-facing settings. We propose that IR benchmarks structure their evaluation methodology to include not only metrics of accuracy, but also efficiency considerations such as query latency and the corresponding cost budget for a reproducible hardware setting. For the popular IR benchmarks MS MARCO and XOR-TyDi, we show how the best choice of IR system varies according to how these efficiency considerations are chosen and weighed. We hope that future benchmarks will adopt these guidelines toward more holistic IR evaluation.
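The point that the best system depends on the efficiency budget can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `SystemResult` fields, the `best_under_budget` helper, the system names, and all numbers are invented and are not taken from the paper or from any benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemResult:
    name: str
    accuracy: float     # any quality metric where higher is better
    latency_ms: float   # per-query latency on one fixed hardware setting
    cost: float         # hypothetical cost unit for that hardware setting


def best_under_budget(results, max_latency_ms, max_cost):
    """Return the most accurate system that fits both budgets, or None."""
    feasible = [
        r for r in results
        if r.latency_ms <= max_latency_ms and r.cost <= max_cost
    ]
    return max(feasible, key=lambda r: r.accuracy, default=None)


# Invented placeholder numbers, used only to exercise the helper.
results = [
    SystemResult("system_a", accuracy=0.20, latency_ms=10.0, cost=1.0),
    SystemResult("system_b", accuracy=0.30, latency_ms=60.0, cost=8.0),
    SystemResult("system_c", accuracy=0.40, latency_ms=200.0, cost=20.0),
]

tight = best_under_budget(results, max_latency_ms=50.0, max_cost=5.0)
loose = best_under_budget(results, max_latency_ms=1000.0, max_cost=100.0)
```

With the tight budget only `system_a` is feasible, while the loose budget admits the most accurate system, so a ranking by accuracy alone would hide the difference.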


DEL-Dock: Molecular Docking-Enabled Modeling of DNA-Encoded Libraries

Kirill Shmilovich , Benson Chen , Theofanis Karaletos , Mohammad M. Sultan

Categories: Machine Learning

2022-11-30

DNA-Encoded Library (DEL) technology has enabled significant advances in hit identification by enabling efficient testing of combinatorially-generated molecular libraries. DEL screens measure protein binding affinity through sequencing reads of molecules tagged with unique DNA-barcodes that survive a series of selection experiments. Computational models have been deployed to learn the latent binding affinities that are correlated to the sequenced count data; however, this correlation is often obfuscated by various sources of noise introduced in its complicated data-generation process. In order to denoise DEL count data and screen for molecules with good binding affinity, computational models require the correct assumptions in their modeling structure to capture the correct signals underlying the data. Recent advances in DEL models have focused on probabilistic formulations of count data, but existing approaches have thus far been limited to only utilizing 2-D molecule-level representations. We introduce a new paradigm, DEL-Dock, that combines ligand-based descriptors with 3-D spatial information from docked protein-ligand complexes. 3-D spatial information allows our model to learn over the actual binding modality rather than using only structure-based information of the ligand. We show that our model is capable of effectively denoising DEL count data to predict molecule enrichment scores that are better correlated with experimental binding affinity measurements compared to prior works. Moreover, by learning over a collection of docked poses we demonstrate that our model, trained only on DEL data, implicitly learns to perform good docking pose selection without requiring external supervision from expensive-to-source protein crystal structures.


SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers

Ameet Deshpande , Md Arafat Sultan , Anthony Ferritto , Ashwin Kalyan , Karthik Narasimhan , Avirup Sil

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-11-29

Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently been getting bigger. Since a different copy of the model is required for each task, this paradigm is infeasible for storage-constrained edge devices like mobile phones. In this paper, we propose SPARTAN, a parameter efficient (PE) and computationally fast architecture for edge devices that adds hierarchically organized sparse memory after each Transformer layer. SPARTAN freezes the PLM parameters and fine-tunes only its memory, thus significantly reducing storage costs by re-using the PLM backbone for different tasks. SPARTAN contains two levels of memory, with only a sparse subset of parents being chosen in the first level for each input, and children cells corresponding to those parents being used to compute an output representation. This sparsity combined with other architecture optimizations improves SPARTAN's throughput by over 90% during inference on a Raspberry Pi 4 when compared to PE baselines (adapters) while also outperforming the latter by 0.1 points on the GLUE benchmark. Further, it can be trained 34% faster in a few-shot setting, while performing within 0.9 points of adapters. Qualitative analysis shows that different parent cells in SPARTAN specialize in different topics, thus dividing responsibility efficiently.


Inferring Attack Relations for Gradual Semantics

Nir Oren , Bruno Yun

Categories: Artificial Intelligence

2022-11-29

A gradual semantics takes a weighted argumentation framework as input and outputs a final acceptability degree for each argument, with different semantics performing the computation in different manners. In this work, we consider the problem of attack inference. That is, given a gradual semantics, a set of arguments with associated initial weights, and the final desirable acceptability degrees associated with each argument, we seek to determine whether there is a set of attacks on those arguments such that we can obtain these acceptability degrees. The main contribution of our work is to demonstrate that the associated decision problem, i.e., whether a set of attacks can exist which allows the final acceptability degrees to occur for given initial weights, is NP-complete for the weighted h-categoriser and cardinality-based semantics, and is polynomial for the weighted max-based semantics, even for the complete version of the problem (where all initial weights and final acceptability degrees are known). We then briefly discuss how this decision problem can be modified to find the attacks themselves and conclude by examining the partial problem where not all initial weights or final acceptability degrees may be known.
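To make the decision problem concrete, here is a minimal sketch for the weighted h-categoriser, using its commonly given definition Deg(a) = w(a) / (1 + sum of Deg(b) over attackers b of a), computed by fixed-point iteration. The function names, the iteration count, the tolerance, and the exclusion of self-attacks are assumptions made for this illustration. The brute-force search enumerates every attack set and is therefore exponential; it is not the method of the paper.

```python
from itertools import chain, combinations


def weighted_h_categoriser(weights, attacks, iterations=200):
    """Approximate acceptability degrees by fixed-point iteration.

    weights: dict mapping argument -> initial weight in [0, 1]
    attacks: iterable of (attacker, target) pairs
    """
    attackers = {a: [] for a in weights}
    for src, dst in attacks:
        attackers[dst].append(src)
    degree = dict(weights)
    for _ in range(iterations):
        degree = {
            a: weights[a] / (1.0 + sum(degree[b] for b in attackers[a]))
            for a in weights
        }
    return degree


def find_attacks(weights, target, tol=1e-6):
    """Brute-force attack inference: try every set of non-self attacks.

    Returns one attack set reproducing the target degrees, or None.
    """
    args = sorted(weights)
    candidates = [(s, t) for s in args for t in args if s != t]
    subsets = chain.from_iterable(
        combinations(candidates, k) for k in range(len(candidates) + 1)
    )
    for subset in subsets:
        degree = weighted_h_categoriser(weights, subset)
        if all(abs(degree[a] - target[a]) <= tol for a in args):
            return set(subset)
    return None


weights = {"a": 1.0, "b": 1.0}
found = find_attacks(weights, {"a": 0.5, "b": 1.0})
```

In this two-argument case the only attack set that yields degree 0.5 for `a` and 1.0 for `b` is the single attack from `b` on `a`.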


Re-Analyze Gauss: Bounds for Private Matrix Approximation via Dyson Brownian Motion

Oren Mangoubi , Nisheeth K. Vishnoi

Categories: Machine Learning | Machine Learning (Statistics)

2022-11-11

Given a symmetric matrix $M$ and a vector $\lambda$, we present new bounds on the Frobenius-distance utility of the Gaussian mechanism for approximating $M$ by a matrix whose spectrum is $\lambda$, under $(\varepsilon,\delta)$-differential privacy. Our bounds depend on both $\lambda$ and the gaps in the eigenvalues of $M$, and hold whenever the top $k+1$ eigenvalues of $M$ have sufficiently large gaps. When applied to the problems of private rank-$k$ covariance matrix approximation and subspace recovery, our bounds yield improvements over previous bounds. Our bounds are obtained by viewing the addition of Gaussian noise as a continuous-time matrix Brownian motion. This viewpoint allows us to track the evolution of eigenvalues and eigenvectors of the matrix, which are governed by stochastic differential equations discovered by Dyson. These equations allow us to bound the utility as the square-root of a sum-of-squares of perturbations to the eigenvectors, as opposed to a sum of perturbation bounds obtained via Davis-Kahan-type theorems.


Wall Street Tree Search: Risk-Aware Planning for Offline Reinforcement Learning

Dan Elbaz , Gal Novik , Oren Salzman

Categories: Machine Learning | Artificial Intelligence | Robotics

2022-11-06

Offline reinforcement-learning (RL) algorithms learn to make decisions using a given, fixed training dataset without the possibility of additional online data collection. This problem setting is captivating because it holds the promise of utilizing previously collected datasets without any costly or risky interaction with the environment. However, this promise also bears the drawback of this setting. The restricted dataset induces subjective uncertainty because the agent can encounter unfamiliar sequences of states and actions that the training data did not cover. Moreover, inherent system stochasticity further increases uncertainty and aggravates the offline RL problem, preventing the agent from learning an optimal policy. To mitigate the destructive uncertainty effects, we need to balance the aspiration to take reward-maximizing actions with the incurred risk due to incorrect ones. In financial economics, modern portfolio theory (MPT) is a method that risk-averse investors can use to construct diversified portfolios that maximize their returns without unacceptable levels of risk. We integrate MPT into the agent's decision-making process to present a simple-yet-highly-effective risk-aware planning algorithm for offline RL. Our algorithm allows us to systematically account for the estimated quality of specific actions and their estimated risk due to the uncertainty. We show that our approach can be coupled with the Transformer architecture to yield a state-of-the-art planner for offline RL tasks, maximizing the return while significantly reducing the variance.
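The trade-off between estimated quality and estimated risk can be sketched with a simple mean-variance score. This is an illustrative simplification and not the planner from the paper: the `risk_aware_choice` helper, the `risk_aversion` parameter, and the sample values are all assumptions, and the score is the textbook mean-variance utility applied to each action independently rather than a full portfolio optimization.

```python
from statistics import mean, pvariance


def risk_aware_choice(return_samples, risk_aversion):
    """Pick the action maximizing mean(returns) - risk_aversion * variance.

    return_samples: dict mapping action -> list of estimated returns,
    for example one estimate per sampled rollout (hypothetical input).
    """
    def score(action):
        samples = return_samples[action]
        return mean(samples) - risk_aversion * pvariance(samples)

    return max(return_samples, key=score)


# Invented estimates: "risky" has the higher mean but a much larger spread.
samples = {
    "safe": [1.0, 1.0, 1.0],
    "risky": [0.0, 3.0],
}

risk_neutral = risk_aware_choice(samples, risk_aversion=0.0)
risk_averse = risk_aware_choice(samples, risk_aversion=1.0)
```

With no risk aversion the higher-mean action wins; once variance is penalized, the choice flips to the action with the lower but more certain return.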


Efficient Few-Shot Learning Without Prompts

Lewis Tunstall , Nils Reimers , Unso Eun Seo Jo , Luke Bates , Daniel Korat , Moshe Wasserblat , Oren Pereg

Categories: Natural Language Processing

2022-09-22

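Recent few-shot methods, such as parameter-efficient fine-tuning (PEFT) and pattern exploiting training (PET), have achieved impressive results in label-scarce settings. However, they are difficult to employ since they are subject to high variability from manually crafted prompts, and typically require billion-parameter language models to achieve high accuracy. To address these shortcomings, we propose SetFit (Sentence Transformer Fine-tuning), an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers (ST). SetFit works by first fine-tuning a pretrained ST on a small number of text pairs, in a contrastive Siamese manner. The resulting model is then used to generate rich text embeddings, which are used to train a classification head. This simple framework requires no prompts or verbalizers, and achieves high accuracy with fewer parameters than existing techniques. Our experiments show that SetFit obtains comparable results with PEFT and PET techniques, while being faster to train. We also show that SetFit can be applied in multilingual settings by simply switching the ST body. Our code is available at https://github.com/huggingface/setFit and our datasets at https://huggingface.co/setfit.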
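The first stage, contrastive fine-tuning on text pairs, needs pairs built from a handful of labeled examples. The sketch below shows one plausible way to build them: pairs with the same label get a similarity target of 1.0 and pairs with different labels get 0.0. The `make_contrastive_pairs` helper and the example texts are assumptions made for illustration; the sketch enumerates all pairs exhaustively and is not the sampling procedure of the SetFit library.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Build (text_a, text_b, similarity) triples from labeled examples.

    examples: list of (text, label) tuples.
    Same-label pairs are positives (1.0); different-label pairs are
    negatives (0.0).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        similarity = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, similarity))
    return pairs


# Invented few-shot examples, two per class.
few_shot = [
    ("great movie", "pos"),
    ("loved it", "pos"),
    ("terrible plot", "neg"),
    ("waste of time", "neg"),
]

pairs = make_contrastive_pairs(few_shot)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
```

Four labeled examples already yield six training pairs, which is part of why pair-based fine-tuning can work with very few labels.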


Deep Learning on Home Drone: Searching for the Optimal Architecture

Alaa Maalouf , Yotam Gurfinkel , Barak Diker , Oren Gal , Daniela Rus , Dan Feldman

Categories: Computer Vision | Machine Learning | Robotics

2022-09-21

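We suggest the first system that runs real-time semantic segmentation via deep learning on a weak micro-computer such as the Raspberry Pi Zero v2 (whose price was $15) attached to a toy drone. In particular, since the Raspberry Pi weighs less than 16 grams, and its size is half of a credit card, we could easily attach it to the common commercial DJI Tello toy drone (<$100, <90 grams, 98 × 92.5 × 41 mm). The result is an autonomous drone (no laptop nor human in the loop) that can detect and classify objects in real time from the video stream of an on-board monocular RGB camera (no GPS or LIDAR sensors). The companion videos demonstrate how this Tello drone scans the lab for people (e.g., for use by firefighters or security forces) and for an empty parking slot outside the lab. Existing deep learning solutions are either much too slow for real-time computation on such IoT devices, or provide results of impractical quality. Our main challenge was to design a system that takes the best of all worlds among numerous combinations of networks, deep learning platforms/frameworks, compression techniques, and compression ratios. To this end, we provide an efficient search algorithm that aims to find the optimal combination, which results in the best trade-off between the network running time and its accuracy/performance.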


A Deep Moving-camera Background Model

Guy Erez , Ron Shapira Weber , Oren Freifeld

Categories: Computer Vision

2022-09-16

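In video analysis, background models have many applications such as background/foreground separation, change detection, anomaly detection, tracking, and more. However, while learning such a model in a video captured by a static camera is a fairly well-solved task, in the case of a Moving-camera Background Model (MCBM), the success has been far more modest due to algorithmic and scalability challenges that arise from the camera motion. Thus, existing MCBMs are limited in their scope and their supported camera-motion types. These hurdles have also impeded the employment, in this unsupervised task, of end-to-end solutions based on deep learning (DL). Moreover, existing MCBMs usually model the background either on the domain of a typically large panoramic image or in an online fashion. Unfortunately, the former creates several problems, including poor scalability, while the latter prevents the recognition and leveraging of cases where the camera revisits previously seen parts of the scene. This paper proposes a new method, called DeepMCBM, that eliminates all the aforementioned issues and achieves state-of-the-art results. Concretely, we first identify the difficulties associated with joint alignment of video frames in general and in a DL setting in particular. Next, we propose a new strategy for joint alignment that lets us use a spatial transformer net with neither a regularization nor any form of specialized (and non-differentiable) initialization. Coupled with an autoencoder conditioned on unwarped robust central moments (obtained from the joint alignment), this yields an end-to-end regularization-free MCBM that supports a broad range of camera motions and scales gracefully. We demonstrate DeepMCBM's utility on a variety of videos, including ones beyond the scope of other methods. Our code is available at https://github.com/bgu-cs-vil/deepmcbm.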
